Abstract: One may consider three types of statistical inference: Bayesian, frequentist,
and group invariance-based. The focus here is on the last method. We
consider the Poisson and binomial distributions in detail to illustrate a group 
invariance method for constructing inferred distributions on parameter spaces 
given observed results. These inferred distributions are obtained without using 
Bayes' method and in particular without using a joint distribution of random 
variable and parameter. In the Poisson and binomial cases, the final formulas 
for inferred distributions coincide with the formulas for Bayes posteriors with 
uniform priors. 



1. Introduction 

The purpose of this paper is to construct a probability distribution on the param- 
eter space given an observed result (inferred distribution) in the case of a discrete 
random variable with a continuous parameter space using group theoretic meth- 
ods. We present two examples, the Poisson and binomial distributions. From the 
point of view of posterior Bayes distributions, these group theoretic methods lead 
to uniform non-informative prior distributions on canonical parameters for both 
the Poisson and binomial cases. Alternatively, posterior distributions are obtained 
here from group theory alone without explicitly using Bayes' method.

The construction of inferred probability distributions by non-Bayesian methods 
has a long history beginning with Fisher's fiducial method of inference. The use of 
group theoretic methods to construct pivotal functions also has a long history as 
introduced in Fraser (1961) and amplified by many others since then. 

Briefly, the group theoretic or "invariance" method of inference has operated 
from a context in which a group G acts upon both the parameter space and the 
sample space. Consider the description as given in Eaton (1989). Let $(X, \mathcal{B})$ represent
a given measurable space associated with random variable X having probability
distribution $Q_0$. Assume that there is an action of group G on the sample space
X of the random variable. There is an induced action of G on probability distributions.
Thus, if X has probability distribution $Q_0$ then define $gQ_0$ as the probability
distribution of the random variable gX. (This is done similarly in Fraser (1961)
for probability distributions of sufficient statistics.) Then consider the collection of
probability distributions $\{gQ_0 \mid g \in G\}$. If G is a Lie group (i.e. parameterized) then
we have a collection of probability distributions indexed by the group parameters
and an associated invariant measure on the parameter space.

The salient feature of this (essentially) pivotal method is an isomorphism between 
three entities: the group G, the parameter space, and the statistic sample space X. 
Clearly it is not applicable in the case of discrete distributions with continuous 
parameter spaces. 

Group invariance methods have also been used to obtain reference priors for 
Bayesian posterior distributions. A comprehensive review on the selection of prior 
distributions is given in Kass and Wasserman (1996). In their section on invariance 
methods, a description is given in which the group G acts on both the parameter space
and the sample space as outlined above. Again, those methods are not applicable 
to discrete distributions with continuous parameter spaces. 

The unique contribution made in this paper is a group theoretic invariant method 
of inference which is indeed applicable in the discrete case. Also, we describe two 
examples, namely, the Poisson and binomial distributions. In this method, the (cho- 
sen) group G acts on the parameter space but not necessarily on the sample space 
and we do not construct a pivotal function. Yet the group theory still leads us to 
an inferred distribution on the parameter space given an observed value of the ran- 
dom variable. Also, from the Bayesian point of view, our group invariance method 
provides reference priors in the case of discrete random variables. The key to the 
method is that a group is used to construct the requisite family in the first place. 
Then the group theory allows us to reverse directions to construct the inferred dis- 
tribution on the parameter space. Technically this inference is possible due to the 
generalized spectral theorem. 

The technical constructions of probability distributions given in this paper stem 
from some methods used in quantum physics which are used for purposes other 
than those described here. We use technical approaches related to four types of 
concepts. One relates to the idea basic to quantum physics of "non-commutative" 
probability as described in Parthasarathy (1992) and Whittle (1992). A second 
concept, so-called "covariant probability operator-valued measures" is used in what
may be described as statistical design problems in communication theory such as 
those found in Holevo (2001), Helstrom (1976), and Busch, Grabowski and Lahti 
(1995). The third concept, "coherent states", is described in Perelomov (1986) from 
a strictly group theoretic point of view and more generally in Ali, Antoine and 
Gazeau (2000). The fourth type of material is group representation theory itself as 
given in Vilenkin (1968). 

It should be noted that some statisticians are becoming interested in quantum 
physics from the point of view of how one should deal with quantum data. An 
overview of quantum theory and the relationship to statistical methods for dealing 
with quantum data is given in Malley and Hornstein (1993) and in Barndorff- 
Nielsen, Gill and Jupp (2003). Explanations of quantum theory and its relationship 
to statistical problems are outlined in the works of Helland, for example Helland
(1999, 2003a, 2003b). However, in this paper we are not dealing with problems of 
quantum data analysis. We are simply using some technical methods which appear 
in the quantum physics literature as well as in the group theory literature for our 
own purposes. 

1.1. Noncommutative probability distributions 

The probability distributions we consider are obtained in a different manner than 
those of classical probability. In Parthasarathy (1992), the difference is explained 
in the following manner. Suppose that we consider the expectation E[Y] of a real
valued discrete random variable. For example, suppose possible value $y_i$ has probability
$p_i$, for $i = 1, 2, \ldots, k$. One can express the expectation in terms of the trace
of the product of two $k \times k$ diagonal matrices, S and O:

$$E[Y] = \sum_{i=1}^{k} p_i y_i = \operatorname{trace}\left[\begin{pmatrix} p_1 & & 0 \\ & \ddots & \\ 0 & & p_k \end{pmatrix}\begin{pmatrix} y_1 & & 0 \\ & \ddots & \\ 0 & & y_k \end{pmatrix}\right] = \operatorname{tr}(SO).$$



In this case since the two matrices are diagonal, they are commutative. However, 
noncommutative matrices (or more generally, linear operators) may be used to 
construct expectations. 
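As a small numerical illustration of the trace formula, the following sketch builds the two diagonal matrices for an assumed three-point distribution and compares the sum with the trace. The probabilities and outcomes are hypothetical values chosen only for the example.

```python
# Illustrative values only: these probabilities and outcomes are assumptions.
p = [0.2, 0.5, 0.3]     # p_1, ..., p_k (they sum to 1)
y = [-1.0, 0.0, 2.0]    # y_1, ..., y_k


def diag(values):
    """Square matrix with the given values on the diagonal."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


S = diag(p)   # state operator
O = diag(y)   # observable operator

expectation_sum = sum(pi * yi for pi, yi in zip(p, y))
expectation_trace = trace(matmul(S, O))
commute = matmul(S, O) == matmul(O, S)   # diagonal matrices commute
```

Both computations give the same expectation, and the two diagonal matrices commute, which is the classical situation the paragraph above describes.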

We begin by showing how to construct noncommutative probability distribu- 
tions. From there we go on to generate families of probability distributions, and 
finally, we construct inferred distributions for the Poisson and binomial families. 

We conceive of a "random experiment" as having two parts. The "input" status 
is represented by a linear, bounded, Hermitian, positive, trace-one operator S called 
a state operator. For example, if one were tossing a coin, the bias of the coin would 
be represented by a state operator; loosely speaking, the state of the coin. The 
measurement process (discrete or continuous) or "output" is represented by a linear 
self-adjoint operator, O, called an observable or outcome operator. So that, if one 
tossed the coin ten times, the measurement process would be to count the number 
of heads. These linear operators act in a complex separable Hilbert space H with 
inner product (•, •), which is linear in the second entry and complex conjugate linear 
in the first entry. 

Since the observable operator is self-adjoint, it has a real spectrum. We shall 
consider cases where the spectrum is either all discrete or all continuous. Although 
operators in a Hilbert space seem far removed from a probability distribution over 
possible results of an experiment, the relationship is made in the following manner: 

(i) The set of possible (real) results of measurement is the spectrum of the ob- 
servable operator O. (So, in the coin tossing experiment, O would have a 
discrete spectrum: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}.) 

(ii) The expected value for those results, using state operator S, is given by 
trace(SO). See Whittle (1992) and Helland (2003a, b). 

In order to obtain a probability distribution, the theory then depends upon the
spectral theorem for self-adjoint operators. To each self-adjoint O is associated a
unique set of projection operators $\{\mathcal{E}(B)\}$, for any real Borel set B, such that

$$P\{\text{result} \in B \text{ when the state operator is } S\} = \operatorname{trace}(S\,\mathcal{E}(B)).$$

This set of projection operators is called the spectral measure or the projection-valued
(PV) measure associated with the self-adjoint operator O. A rigorous definition
of PV measure is given in Section 2.5.

There are certain kinds of state operators that are simple to manipulate. They
are the projectors onto one-dimensional subspaces spanned by unit vectors $\varphi$ in the
Hilbert space $\mathcal{H}$. Since each such projection operator is identified by a unit vector
in $\mathcal{H}$, the unit vector itself is called a vector state. In this case, the trace formula
simplifies to an inner product: $\operatorname{trace}(S\,\mathcal{E}(B)) = (\varphi, \mathcal{E}(B)\varphi)$, where S is the
projector onto the one-dimensional subspace spanned by unit vector $\varphi$.

Note that if unit vector $\varphi$ is multiplied by a scalar $\epsilon$ of unit modulus, we obtain
the same probability distribution as with the vector $\varphi$ itself. Thus we distinguish
between a single unit vector $\varphi$ and the equivalence class of unit vectors $\{\epsilon\varphi\}$ of
which $\varphi$ is a representative. We use the words vector state or just state to refer
to an arbitrary representative of a unit vector equivalence class. Thus since, for
complex number $\epsilon$ of unit modulus,

$$P\{\text{result} \in B \text{ when the state is } \varphi\} = P\{\text{result} \in B \text{ when the state is } \epsilon\varphi\},$$

we take $\varphi$ and $\epsilon\varphi$ to be the same state even though they are not the same vectors.
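The phase invariance can be checked numerically. The sketch below uses a hypothetical unit vector in a two-dimensional space and the standard basis as eigenvectors; both choices are assumptions made for illustration, not taken from the paper.

```python
import cmath


def inner(u, v):
    """Inner product, conjugate linear in the first entry and linear in the second."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def prob(state, eigvec):
    """Probability attached to a one-dimensional eigenspace: |(state, eigvec)|^2."""
    return abs(inner(state, eigvec)) ** 2


# Hypothetical unit vector: |3/5|^2 + |4i/5|^2 = 1
phi = [complex(3, 0) / 5, complex(0, 4) / 5]
eps = cmath.exp(1j * 0.7)              # a scalar of unit modulus
phi_rotated = [eps * c for c in phi]   # a different vector, the same state

basis = [[1 + 0j, 0j], [0j, 1 + 0j]]   # assumed eigenvectors of the observable
probs = [prob(phi, e) for e in basis]
probs_rotated = [prob(phi_rotated, e) for e in basis]
```

The two lists of probabilities agree to rounding error, which is why a state is identified with the whole equivalence class rather than with a single vector.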

From now on we reserve the use of the word state for vector states as described 
above. To designate a state which is an operator, as opposed to a vector, we use 
the phrase state operator. 

1.2. Discrete probability distributions 

Consider the case where the spectrum of O is purely discrete and finite, consisting
of eigenvalues $\{y_i\}$. Then the eigenvectors $\{\eta_i\}$ of O form a complete orthonormal
basis for the Hilbert space $\mathcal{H}$. In the infinite case, the Hilbert space is realized as
an $\ell^2$ space of square summable sequences. When the state is $\psi$, the probability of
obtaining result $y_i$ is given by $(\psi, \mathcal{E}(\{y_i\})\psi)$, where $\mathcal{E}(\{y_i\})$ is the projection onto
the subspace spanned by the eigenvectors of the eigenvalue $y_i$.

In particular, when the spectrum is simple, that is, when there is only one eigenvector
$\eta_i$ for each eigenvalue $y_i$,

(1.2.1)    $P\{\text{result} = y_i \text{ when the state is } \psi\} = |(\psi, \eta_i)|^2.$

In order to present examples, we must first decide where to start. The natural 
method in the performance of statistical inference is to start with a statistical model 
(say a parametric family of probability distributions) pertaining to the particular 
physical properties of a given random experiment. Then, perhaps, one may con- 
struct posterior distributions on the parameter space based upon observed results. 
However, here we attempt to construct prototype families for which the inference 
procedures that we illustrate below can be put in place.

Thus instead of starting with a statistical model for a particular situation, we 
start with an observable self-adjoint operator. As this paper progresses, it will 
become clear that selections of observables and families of states stem primarily 
from the selection of a Lie algebra. In this section, however, we consider an example 
of a PV measure by starting with a given observable operator in the case where its 
spectrum is discrete and, in fact, finite. 

Example 1.2. Consider an experiment with three possible results 1, 0, -1. Suppose
the observable operator is 



Note that O is Hermitian. The eigenvalues of O are 1, 0, -1, and corresponding
eigenvectors are 



Once the measurement is represented by a self-adjoint operator O whose eigen- 
vectors serve as a basis for the Hilbert space, then the probability distribution is 
determined by the choice of state. 
Part (a). Consider the unit vector $\xi = \frac{1}{\sqrt{14}}(1, 2, 3i)^T$. Using (1.2.1) we have,

$$P_\xi(\text{result} = 1) = |(\eta_1, \xi)|^2 = \frac{1}{14},$$

$$P_\xi(\text{result} = 0) = |(\eta_2, \xi)|^2 = \frac{4}{14},$$

$$P_\xi(\text{result} = -1) = |(\eta_3, \xi)|^2 = \frac{9}{14}.$$
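The three probabilities 1/14, 4/14 and 9/14 can be reproduced with exact arithmetic. This sketch takes the state to be (1, 2, 3i) divided by the square root of 14 and assumes that the eigenvectors for results 1, 0, -1 are the standard basis vectors of three-dimensional complex space, which is consistent with the quoted probabilities but is an assumption here.

```python
from fractions import Fraction

# Assumption: eta_1, eta_2, eta_3 are the standard basis vectors of C^3,
# so |(eta_i, xi)|^2 is the squared modulus of the i-th component of xi.
results = (1, 0, -1)
xi_unnormalised = [1, 2, 3j]

norm_sq = sum(abs(c) ** 2 for c in xi_unnormalised)   # 1 + 4 + 9 = 14

probs = {
    result: Fraction(round(abs(c) ** 2), round(norm_sq))
    for result, c in zip(results, xi_unnormalised)
}
```

The dictionary `probs` maps each result to its probability as an exact fraction, and the values sum to one because the state is a unit vector.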



Part(b). Consider the unit vector V^o = ^ ( V% 



i 



The probabilities for results 1,0,-1 are |, |, | respectively. We see here 
how the choice of the state determines the probability distribution. 

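Part (c). Suppose the experiment has rotational symmetry and the probabilistic
model does not change under rotation in three-dimensional space. Consider a
family of states corresponding to points on the unit sphere indexed by angles
$\beta$ and $\theta$ where $0 \le \beta < 2\pi$, $0 \le \theta < \pi$. Let

$$\psi_{\beta,\theta} = \left(e^{-i\beta}\cos^2\frac{\theta}{2},\ \frac{1}{\sqrt{2}}\sin\theta,\ e^{i\beta}\sin^2\frac{\theta}{2}\right).$$

Then

$$P_{\psi_{\beta,\theta}}(\text{result} = 1) = |(\eta_1, \psi_{\beta,\theta})|^2 = \cos^4\frac{\theta}{2},$$

$$P_{\psi_{\beta,\theta}}(\text{result} = 0) = |(\eta_2, \psi_{\beta,\theta})|^2 = \frac{1}{2}\sin^2\theta = 2\sin^2\frac{\theta}{2}\cos^2\frac{\theta}{2},$$

$$P_{\psi_{\beta,\theta}}(\text{result} = -1) = |(\eta_3, \psi_{\beta,\theta})|^2 = \sin^4\frac{\theta}{2}.$$

Relabel the possible values 1, 0, -1 as 0, 1, 2, and let $p = \sin^2\frac{\theta}{2}$. Then
this family becomes the binomial distribution with n = 2.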
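The identification with the binomial family can be checked numerically. The sketch below evaluates the squared moduli of the three components of the state for one hypothetical pair of angles and compares them with the binomial probabilities for n = 2 and p equal to the squared sine of half the polar angle. As in Part (a), the eigenvectors are assumed to be the standard basis vectors.

```python
import cmath
import math


def state(beta, theta):
    """Components of the state indexed by the angles beta and theta."""
    return [
        cmath.exp(-1j * beta) * math.cos(theta / 2) ** 2,
        math.sin(theta) / math.sqrt(2),
        cmath.exp(1j * beta) * math.sin(theta / 2) ** 2,
    ]


def state_probs(beta, theta):
    """Probabilities of results 1, 0, -1 (relabelled as 0, 1, 2)."""
    return [abs(c) ** 2 for c in state(beta, theta)]


def binomial_n2(p):
    """Binomial probabilities for 0, 1, 2 successes in n = 2 trials."""
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]


beta, theta = 1.1, 0.8                 # hypothetical angles
probs = state_probs(beta, theta)
p = math.sin(theta / 2) ** 2
expected = binomial_n2(p)
```

The angle beta affects only the phases of the components, so it drops out of the probabilities; only theta enters, through p.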



1.3. Continuous probability distributions 

In the case where the observable self-adjoint operator O has a purely continuous
simple spectrum sp(O) (that is, there exists a vector $\psi_0$ in the domain of O such
that finite linear combinations of $O^n\psi_0$ are dense in the domain), the Hilbert space
is realized as an $L^2(\mathrm{sp}(O), \mu)$ space of complex-valued square integrable functions
of a real variable x with inner product

$$(\psi(x), \phi(x)) = \int_{\mathrm{sp}(O)} \psi(x)^{*}\,\phi(x)\,\mu(dx),$$

for some finite measure $\mu$ with support sp(O), where * indicates complex conjugate.
From the spectral theorem (Beltrametti and Cassinelli (1981)), we have the result
that self-adjoint operator O determines a unique projection-valued (PV) measure
$\{\mathcal{E}(B)\}$ for real Borel sets B. In that case, integrating with respect to the PV
measure $\mathcal{E}(B)$, we have formal operator equations:

(i)    $\int_{\mathrm{sp}(O)} \mathcal{E}(dx) = I.$

This should be understood in the sense that $(\psi(x), \mathcal{E}(dx)\psi(x))$ is a probability
measure on sp(O), for all $\psi$ in the domain of O, since $(\psi(x), \mathcal{E}(B)\psi(x))$ is well
defined on any Borel set $B \subset \mathrm{sp}(O)$.

(ii)    $O = \int_{\mathrm{sp}(O)} x\,\mathcal{E}(dx).$

It follows that for certain operator functions f(O), we have

$$f(O) = \int_{\mathrm{sp}(O)} f(x)\,\mathcal{E}(dx).$$

In particular, let $\chi_B(x)$ be the characteristic function for Borel set B and let the
corresponding operator function be designated as $\chi_B(O)$. Then

$$\chi_B(O) = \int_{\mathrm{sp}(O)} \chi_B(x)\,\mathcal{E}(dx).$$

For vector $\xi$ in the domain of O,

$$(\xi, \chi_B(O)\xi) = \int_{\mathrm{sp}(O)} \chi_B(x)\,(\xi(x), \mathcal{E}(dx)\xi(x)).$$

When the Hilbert space is constructed in this manner, we say the particular
Hilbert space realization is "diagonal in O" or is "O-space". In that case, the O
operator is the "multiplication" operator, thus: $O\xi(x) = x\xi(x)$ (which explains why
the spectral measure for Borel set B is just the characteristic function of that set).

In the diagonal representation of the Hilbert space, since the projection operators
$\{\mathcal{E}(B)\}$ are simply the characteristic function $\chi_B(O)$ for that Borel set, we have a
simplified form for the probability formula. For unit vector $\psi$ in the domain of O:

$$P_\psi\{O \text{ result} \in B\} = (\psi, \chi_B(O)\psi) = \int_B |\psi(x)|^2\,\mu(dx).$$

Note that, in this O-diagonal space, the probability distribution is determined by
the choice of state $\psi$.

It is possible to have spectral measures associated with operators which are 
not self-adjoint. Normal operators also uniquely determine spectral measures but 
the spectrum might be complex. Subnormal operators are associated with spectral 
measures in which the operators F(B), for complex Borel set B, are not projection
operators but are positive operators. We will be using spectral measures of this
sort, called "positive-operator-valued" (POV) measures (ref. Section 2.5), instead
of projection-valued (PV) measures when we consider probability distributions on 
parameter spaces. 

Example 1.3. We consider the self-adjoint operator Q where $Q\psi(x) = x\psi(x)$,
$\psi \in L^2(\mathrm{sp}(Q), dx) = \mathcal{H}$. The Hilbert space $\mathcal{H}$ is diagonal in Q, which represents the
measurement of one-dimensional position in an experiment. The spectrum sp(Q)
of Q is the whole real line $\mathbb{R}$.

We choose a state (function of x), $\psi(x) = \dfrac{e^{-x^2/(4\sigma^2)}}{(2\pi\sigma^2)^{1/4}}$ for $\sigma > 0$. Then

$$P_\psi\{\text{position} \in B\} = (\psi(x), \mathcal{E}(B)\psi(x)) = (\psi(x), \chi_B(Q)\psi(x)) = \int_B |\psi(x)|^2\,dx.$$

Thus, the probability density function for the distribution is the modulus squared
of $\psi(x)$, which is the normal density function with zero mean and variance $\sigma^2$.
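To see the example numerically, the sketch below integrates the squared modulus of the state over an interval and compares the result with the normal distribution function. The interval, the value of sigma and the use of Simpson's rule are choices made for this illustration only.

```python
import math


def psi(x, sigma):
    """State whose squared modulus is the N(0, sigma^2) density."""
    return math.exp(-x * x / (4 * sigma ** 2)) / (2 * math.pi * sigma ** 2) ** 0.25


def prob_interval(a, b, sigma, steps=2000):
    """Integral of |psi|^2 over [a, b] by composite Simpson's rule (steps is even)."""
    h = (b - a) / steps
    total = psi(a, sigma) ** 2 + psi(b, sigma) ** 2
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * psi(a + i * h, sigma) ** 2
    return total * h / 3


def normal_cdf(x, sigma):
    """Distribution function of a zero-mean normal with standard deviation sigma."""
    return 0.5 * (1 + math.erf(x / (sigma * math.sqrt(2))))


sigma = 1.5            # hypothetical value
a, b = -1.0, 2.0       # hypothetical Borel set B = [a, b]
numeric = prob_interval(a, b, sigma)
exact = normal_cdf(b, sigma) - normal_cdf(a, sigma)
```

The two numbers agree closely, as expected when the squared modulus of the state is the normal density.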
1.4. Groups and group representations

We will consider groups, transformation groups, Lie groups, Lie algebras, and rep- 
resentations of them by linear operators in various linear spaces. 

1.4.1. Group representations

Let G be a group with elements denoted by g. Let T denote a linear operator on
a linear space $\mathcal{H}$, a complex Hilbert space. If, to every $g \in G$, there is assigned a
linear operator T(g) such that,

(i) $T(g_1 g_2) = T(g_1)T(g_2)$,

(ii) $T(e) = I$,

where e is the identity element of G and I is the unit operator on $\mathcal{H}$, then the
assignment $g \to T(g)$ is called a linear representation of G by operators T on $\mathcal{H}$.
Usually, the word linear is omitted when referring to linear representations. The
dimension of the representation is the dimension of the linear space $\mathcal{H}$.
A representation is called projective if (i) above is replaced by

(i') $T(g_1 g_2) = \epsilon(g_1, g_2)\,T(g_1)T(g_2)$, $|\epsilon(g_1, g_2)| = 1$.

A representation is called unitary if each T(g) is a unitary operator.

Two representations T(g) and Q(g) on linear spaces $\mathcal{H}$ and $\mathcal{K}$ are said to be
equivalent if there exists a linear operator V, mapping $\mathcal{H}$ into $\mathcal{K}$ with an inverse
$V^{-1}$, such that $Q(g) = V T(g) V^{-1}$.

A subspace $\mathcal{H}_1$ of the space $\mathcal{H}$ of the representation T(g) is called invariant if,
for $\psi \in \mathcal{H}_1$, $T(g)\psi \in \mathcal{H}_1$ for all $g \in G$. For every representation there are two
trivial invariant subspaces, namely the whole space and the null subspace. If a
representation T(g) possesses only trivial invariant subspaces, it is called irreducible.

We shall be concerned with irreducible, projective, unitary representations of two 
particular groups. 
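As a toy illustration of the defining properties of a unitary representation, the following sketch uses plane rotations, with the group taken to be the real numbers under addition. This example is not one of the two groups treated in the paper; it is chosen only because the properties are easy to verify.

```python
import math


def rotation(theta):
    """T(theta): rotation of the plane by angle theta (a real orthogonal matrix)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
g1, g2 = 0.4, 1.3   # two group elements; the group operation is addition of angles

# (i) T(g1 g2) = T(g1) T(g2)
homomorphism_ok = close(rotation(g1 + g2), matmul(rotation(g1), rotation(g2)))
# (ii) T(e) = I, where the identity element is the angle 0
identity_ok = close(rotation(0.0), IDENTITY)
# Unitarity: for a real matrix the adjoint is the transpose
unitary_ok = close(matmul(transpose(rotation(g1)), rotation(g1)), IDENTITY)
```

All three checks hold for this example; no claim about irreducibility is made for it.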

1.4.2. Transformation groups

By a transformation of a set $\Omega$, we mean a one-to-one mapping of the set onto
itself. Let G be some group. G is a transformation group of the set $\Omega$ if, with each
element g of this group, we can associate a transformation $\omega \to g\omega$ in $\Omega$, where for
any $\omega \in \Omega$,

(i) $(g_1 g_2)\omega = g_1(g_2\omega)$ and

(ii) $e\omega = \omega$.

A transformation group G on the set $\Omega$ is called effective if the only element g
for which $g\omega = \omega$ for all $\omega \in \Omega$ is the identity element e of G. An effective group
G is called transitive on the set $\Omega$ if, for any two elements $\omega_1, \omega_2 \in \Omega$, there is
some $g \in G$ such that $\omega_2 = g\omega_1$. If G is transitive on a set $\Omega$, then $\Omega$ is called a
homogeneous space for the group G.

For example, the rotation group acting on three-dimensional Euclidean space is not
transitive on that space. A point on a given sphere cannot be changed to a point on a sphere of
a different radius by a rotation. However, the unit two-sphere is a homogeneous 
space for the rotation group. 
Let G be a transitive transformation group of a set $\Omega$, and let $\omega_0$ be some fixed
point of this set. Let H be the subgroup of elements of G which leave the point $\omega_0$
fixed. H is called the stationary subgroup of the point $\omega_0$. Let $\omega_1$ be another point
in $\Omega$ and let the transformation g carry $\omega_0$ into $\omega_1$. Then transformations of the
form $g h g^{-1}$, $h \in H$, leave the point $\omega_1$ fixed. The stationary subgroups of the two
points are conjugate to each other.

Take one of the mutually conjugate stationary subgroups H and denote by G/H
the space of left cosets of G with respect to H. The set G/H is a homogeneous space
for G as a transformation group. There is a one-to-one correspondence between the
homogeneous space $\Omega$ and the coset space G/H. For example, consider the rotation
group in three-dimensional space represented by the group of special orthogonal
real $3 \times 3$ matrices SO(3). The set of left cosets SO(3)/SO(2) can be put into
one-to-one correspondence with the unit two-sphere.

1.4.3. Lie algebras and Lie groups

An abstract Lie algebra $\mathcal{G}$ over the complex or real field is a vector space together
with a product [X, Y] such that for all vectors X, Y, Z in $\mathcal{G}$ and a, b in the field,

(i) $[X, Y] = -[Y, X]$,

(ii) $[aX + bY, Z] = a[X, Z] + b[Y, Z]$,

(iii) $[[X, Y], Z] + [[Z, X], Y] + [[Y, Z], X] = 0$.

A representation of an abstract Lie algebra by linear operators on a vector space
such as a Hilbert space $\mathcal{H}$ is an algebra homomorphism in the sense that the
representation operators have the same product properties as those of the original
abstract algebra. For an associative algebra of linear operators, the product
operation [A, B] is the commutation operation [A, B] = AB - BA.
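The three defining properties can be checked for the commutator on a concrete set of matrices. The sketch below uses three strictly upper triangular 3 x 3 matrices, a standard finite-dimensional realization of a three-dimensional nilpotent Lie algebra. It is an assumption chosen for illustration and is not the Hilbert space representation used later in the paper.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def bracket(a, b):
    """Commutator [A, B] = AB - BA."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def jacobi(x, y, z):
    """Left-hand side of property (iii)."""
    return add(add(bracket(bracket(x, y), z), bracket(bracket(z, x), y)),
               bracket(bracket(y, z), x))


# Hypothetical basis: strictly upper triangular integer matrices
E1 = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
E2 = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
E3 = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

X = add(E1, scale(2, E2))
Y = add(E2, scale(-1, E3))
Z = add(scale(3, E1), E3)
```

With integer entries all checks are exact. For this basis the only nonzero bracket among basis elements is [E1, E2] = E3.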

We will consider representations of two Lie algebras of dimension three. If the
basis elements are linear operators $E_1, E_2, E_3$, we may indicate a general element
as a linear combination $X = aE_1 + bE_2 + cE_3$. The scalar parameters {a, b, c} or
a subset of them will then become the parameters of the associated probability
distribution.

A Lie group is a topological group which is an analytic manifold. The tangent 
space to that manifold at the identity of the group is called the Lie algebra of the 
group. It can be shown that the Lie algebra of a Lie group is an abstract Lie algebra. 
In the case of a linear representation of a Lie group, the associated Lie algebra can 
be computed explicitly by differentiating curves through the identity. On the other 
hand, a (local) Lie group associated with a given Lie algebra can be computed 
explicitly by the so-called exponential map. (See, for example, Miller (1972).) 

For our purposes, we focus upon the parameter space which a Lie group repre- 
sentation inherits from its Lie algebra representation via the exponential map. 

1.5. Families of probability distributions

Let G be a group and $g \to U(g)$ be an irreducible projective unitary representation
of G in the Hilbert space $\mathcal{H}$. For fixed unit vector $\psi_0$, the action of U(g) on $\psi_0$ is
designated by,

(1.5.1)    $\psi_g = U(g)\psi_0.$

Since each U(g) is unitary, each $\psi_g$ is a unit vector and so can serve as a state.
This method of generating a family leads to states designated as "coherent states".
The name originated in the field of quantum mechanics. However we use it in a
purely group theoretic context, as in Perelomov (1986) and Ali, Antoine, Gazeau
(2000). In the two families of probability distributions that we consider in detail,
the corresponding families of states are coherent states. See the references given
above for properties of coherent states along with examples and suggestions for
generalizations. Families of states lead to families of probability distributions. Thus,
for self-adjoint operator O in O-diagonal Hilbert space $\mathcal{H}$,

$$P_{\psi_g}\{O \text{ result in } B\} = (U(g)\psi_0, \mathcal{E}(B)U(g)\psi_0) = \sum_i |(U(g)\psi_0, \eta_i)|^2$$

in the discrete case, where $\eta_i$ is the eigenvector corresponding to the ith eigenvalue
of O, each eigenspace is one-dimensional, and the sum is over all eigenvalues in B,
and in the continuous case,

$$P_{\psi_g}\{O \text{ result in } B\} = (U(g)\psi_0, \mathcal{E}(B)U(g)\psi_0) = \int_B |U(g)\psi_0(x)|^2\,\mu(dx).$$

2. The Poisson family

We construct the Poisson family by first constructing a particular family of coherent
states of the form (1.5.1) in an $\ell^2$ Hilbert space $\mathcal{H}_N$. The family is indexed by a
parameter set which also indexes a homogeneous space for a certain transformation
group; namely, the Weyl-Heisenberg group, denoted $G_W$. Representation operators
T(g), $g \in G_W$, acting on a fixed vector in $\mathcal{H}_N$ as in (1.5.1), generate the coherent
states which, in turn, generate the family of probability distributions which leads
to the Poisson distribution. This provides a context for an inferred probability
distribution on the parameter space.

2.1. Representations of the Weyl-Heisenberg Lie group 

The group $G_W$ can be described abstractly as a three-parameter group with elements
$g(s; x_1, x_2)$, for real parameters s, $x_1$ and $x_2$, where the multiplication law is
given by

$$(s; x_1, x_2)(t; y_1, y_2) = \left(s + t + \tfrac{1}{2}(x_1 y_2 - y_1 x_2);\ x_1 + y_1,\ x_2 + y_2\right).$$

Alternatively, we may consider one real parameter s and one complex parameter
$\alpha$, where

(2.1.1)    $\alpha = \frac{1}{\sqrt{2}}(-x_1 + i x_2).$

Then, $(s; \alpha)(t; \beta) = (s + t + \mathrm{Im}(\alpha\beta^{*});\ \alpha + \beta)$. The Lie algebra $\mathcal{G}_W$ of the Lie group
$G_W$ is a nilpotent three-dimensional algebra. Basis elements can be designated
abstractly as $e_1, e_2, e_3$, with commutation relations $[e_1, e_2] = e_3$, $[e_1, e_3] = [e_2, e_3] = 0$.
We consider a linear representation of the algebra with basis given by linear
operators $E_j$, for j = 1, 2, 3, which operate in a Hilbert space $\mathcal{H}$. These operators
are such that operators $iE_j$ are self-adjoint with respect to the inner product in $\mathcal{H}$.
That property is necessary in order that from this algebra we may construct group
representation operators which are unitary. Since the representation is an algebra
homomorphism, the linear operators $E_j$ have the same commutation relations as
the abstract Lie algebra above.
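The multiplication law of the Weyl-Heisenberg group and its complex form can be compared directly. In the sketch below group elements are plain tuples, and the function names are hypothetical helpers introduced for this illustration.

```python
import math


def mul_real(g, h):
    """Product (s; x1, x2)(t; y1, y2) in the real parameterization."""
    s, x1, x2 = g
    t, y1, y2 = h
    return (s + t + 0.5 * (x1 * y2 - y1 * x2), x1 + y1, x2 + y2)


def to_complex(g):
    """Map (s; x1, x2) to (s; alpha) with alpha = (-x1 + i x2) / sqrt(2)."""
    s, x1, x2 = g
    return (s, complex(-x1, x2) / math.sqrt(2))


def mul_complex(g, h):
    """Product (s; alpha)(t; beta) = (s + t + Im(alpha beta*); alpha + beta)."""
    s, a = g
    t, b = h
    return (s + t + (a * b.conjugate()).imag, a + b)


def close(u, v, tol=1e-12):
    return all(abs(p - q) < tol for p, q in zip(u, v))


# Hypothetical group elements
g = (0.3, 1.0, -2.0)
h = (-0.7, 0.5, 0.25)
k = (1.1, -1.5, 0.75)

same_law = close(to_complex(mul_real(g, h)),
                 mul_complex(to_complex(g), to_complex(h)))
associative = close(mul_real(mul_real(g, h), k), mul_real(g, mul_real(h, k)))
non_commutative = not close(mul_real(g, h), mul_real(h, g))
```

The first check confirms that the two parameterizations describe the same product. The last shows that only the first parameter depends on the order of the factors.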

It will prove to be convenient to consider an alternative basis for the three-dimensional
linear space of those operators. Put

$$A = \frac{1}{\sqrt{2}}(E_1 - iE_2), \qquad A^{\dagger} = -\frac{1}{\sqrt{2}}(E_1 + iE_2), \qquad I = -iE_3.$$

Note that, due to the fact that the $iE_j$ operators are self-adjoint, $A^{\dagger}$ is indeed
an adjoint operator to A. Although A and $A^{\dagger}$ are not self-adjoint operators, the
operator $N = A^{\dagger}A$ is self-adjoint. We have

(2.1.2)    $[A, A^{\dagger}] = I, \qquad [A, I] = [A^{\dagger}, I] = 0.$

A general element of the Lie algebra of operators described above is given by a
linear combination of the basis vectors such as $X = isI + \alpha A^{\dagger} - \alpha^{*}A$. This form
of linear combination derives from $X = x_1E_1 + x_2E_2 + sE_3$ and (2.1.1). We may
now proceed to obtain a group representation by unitary operators T in the Hilbert
space $\mathcal{H}$. By virtue of the exponential map we have $(s; \alpha) \to T(s; \alpha) = \exp(X)$.
Since the I operator commutes with A and $A^{\dagger}$, we may write $T(s; \alpha) = e^{is}D(\alpha)$
where $D(\alpha) = \exp(\alpha A^{\dagger} - \alpha^{*}A)$. It is known that this representation is irreducible.

2.2. The Hilbert space of the irreducible representation 

The linear operators mentioned above act in a Hilbert space which has been designated
abstractly as $\mathcal{H}$. In order to consider concrete formulas for probability distributions,
it is necessary to give a concrete form to the Hilbert space. In the case of
the Poisson family, the space designated $\mathcal{H}_N$ is realized as an $\ell^2$ space of complex-valued
square summable sequences with basis consisting of the eigenvectors of the
self-adjoint operator N.

By the so-called "ladder" method, using (2.1.2), it has been found that N has
a simple discrete spectrum of non-negative integers. Thus, by the general theory,
its eigenvectors $\phi_k$, k = 0, 1, 2, ..., form a complete orthonormal set in $\mathcal{H}$ which
forms a basis for the $\ell^2$ Hilbert space realization $\mathcal{H}_N$.

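In $\mathcal{H}_N$, we have the following useful properties of the A (annihilation), $A^{\dagger}$ (creation),
and N (number) operators.

(2.2.1)
$$A\phi_0 = 0, \qquad A\phi_k = \sqrt{k}\,\phi_{k-1} \quad \text{for } k = 1, 2, 3, \ldots,$$
$$A^{\dagger}\phi_k = \sqrt{k+1}\,\phi_{k+1} \quad \text{for } k = 0, 1, 2, 3, \ldots,$$
$$N\phi_k = k\,\phi_k \quad \text{for } k = 0, 1, 2, 3, \ldots.$$

Then we can relate $\phi_k$ to $\phi_0$ by

(2.2.2)    $\phi_k = \frac{1}{\sqrt{k!}}\,(A^{\dagger})^k \phi_0 \quad \text{for } k = 0, 1, 2, \ldots.$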
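The ladder relations can be illustrated with finite matrices. The sketch below truncates the space to its first few basis vectors, which is an assumption made only so that the operators fit in a small matrix. On the truncated space the commutation relation (2.1.2) necessarily fails in the last diagonal entry, and the code makes that visible.

```python
import math

DIM = 6   # truncation level; the actual operators act on an infinite-dimensional space


def annihilation(dim):
    """Matrix of A in the basis phi_0, ..., phi_{dim-1}."""
    a = [[0.0] * dim for _ in range(dim)]
    for k in range(1, dim):
        a[k - 1][k] = math.sqrt(k)   # A phi_k = sqrt(k) phi_{k-1}
    return a


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def basis(k, dim=DIM):
    return [1.0 if i == k else 0.0 for i in range(dim)]


A = annihilation(DIM)
A_dag = transpose(A)          # entries are real, so the adjoint is the transpose
N = matmul(A_dag, A)          # number operator
AA_dag = matmul(A, A_dag)

number_diag = [N[k][k] for k in range(DIM)]
commutator_diag = [AA_dag[k][k] - N[k][k] for k in range(DIM)]


def ladder_up(k):
    """(A_dag)^k phi_0 / sqrt(k!), which should equal phi_k by (2.2.2)."""
    v = basis(0)
    for _ in range(k):
        v = apply(A_dag, v)
    return [c / math.sqrt(math.factorial(k)) for c in v]
```

The diagonal of N reproduces the eigenvalues 0, 1, 2, ..., and the commutator has ones on the diagonal except in the final entry, which is an artifact of the truncation.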

2.3. Family of coherent states generated by group operators 

To construct a family of coherent states in $\mathcal{H}_N$ leading to the Poisson distribution,
we operate on the basis vector $\phi_0$ with $D(\alpha)$ operators indexed by complex number
$\alpha$, writing

(2.3.1)    $v(\alpha) = D(\alpha)\phi_0.$
To find an explicit formula for the v vectors, write $D(\alpha)$ as a product of exponential
operators. Since A and $A^{\dagger}$ do not commute, we do not have the property which
pertains to scalar exponential functions that $D(\alpha) = \exp(\alpha A^{\dagger})\exp(-\alpha^{*}A)$. We use
the Baker-Campbell-Hausdorff operator identity:

(2.3.2)    $\exp(O_1)\exp(O_2) = \exp\left(\tfrac{1}{2}[O_1, O_2]\right)\exp(O_1 + O_2),$

which is valid in the case where the commutator $[O_1, O_2]$ commutes with both
operators $O_1$ and $O_2$. Putting $O_1 = \alpha A^{\dagger}$, $O_2 = -\alpha^{*}A$, so that
$[O_1, O_2] = |\alpha|^2 I$ by (2.1.2), we have

$$D(\alpha)\phi_0 = e^{-|\alpha|^2/2}\exp(\alpha A^{\dagger})\exp(-\alpha^{*}A)\,\phi_0.$$

For linear operators, we have the same kind of expansion for an exponential operator
function as for a scalar function. Expanding $\exp(-\alpha^{*}A)\phi_0$ and using (2.2.1), we find
that $\exp(-\alpha^{*}A)\phi_0 = I\phi_0$. Then from (2.2.2), we see that $(A^{\dagger})^k\phi_0 = \sqrt{k!}\,\phi_k$. From
(2.2.1) and (2.3.1),

(2.3.3)    $D(\alpha)\phi_0 = e^{-|\alpha|^2/2}\sum_{k=0}^{\infty}\frac{\alpha^k}{\sqrt{k!}}\,\phi_k.$
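The series (2.3.3) can be compared with a direct evaluation of the exponential operator applied to the first basis vector. The sketch below is a numerical check under two assumptions made here for illustration: the space is truncated to its first 40 basis vectors, and the exponential is summed as a finite Taylor series. The function names are hypothetical.

```python
import math


def apply_generator(v, alpha):
    """Apply alpha * A_dag - conj(alpha) * A to a vector of coefficients on phi_0, phi_1, ..."""
    dim = len(v)
    out = [0j] * dim
    for k in range(dim):
        if k + 1 < dim:
            out[k + 1] += alpha * math.sqrt(k + 1) * v[k]        # A_dag phi_k term
        if k >= 1:
            out[k - 1] -= alpha.conjugate() * math.sqrt(k) * v[k]  # A phi_k term
    return out


def displaced_vacuum(alpha, dim=40, terms=80):
    """Taylor series of exp(alpha * A_dag - conj(alpha) * A) applied to phi_0."""
    term = [1 + 0j] + [0j] * (dim - 1)   # phi_0
    total = list(term)
    for m in range(1, terms + 1):
        term = [c / m for c in apply_generator(term, alpha)]
        total = [t + c for t, c in zip(total, term)]
    return total


def series_formula(alpha, dim=40):
    """Coefficients of the series (2.3.3) on phi_0, ..., phi_{dim-1}."""
    prefactor = math.exp(-abs(alpha) ** 2 / 2)
    return [prefactor * alpha ** k / math.sqrt(math.factorial(k)) for k in range(dim)]


alpha = complex(0.6, 0.3)    # hypothetical parameter value
numeric = displaced_vacuum(alpha)
exact = series_formula(alpha)
max_error = max(abs(a - b) for a, b in zip(numeric, exact))
norm_sq = sum(abs(c) ** 2 for c in numeric)
```

For a moderate value of alpha the truncation has a negligible effect, so the two coefficient vectors agree closely and the result is a unit vector, as expected for a state.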

2.4. Family of probability distributions

Let the observable (self-adjoint) number operator N represent the physical quantity
being "counted" with possible outcomes 0, 1, 2, .... Using the family of coherent
states given above, the probability distributions are,

$$P_{v(\alpha)}\{\text{result} = n\} = |(\phi_n, v(\alpha))|^2.$$

By expression (2.3.3) and the orthogonality of the basis vectors,

(2.4.1)    $(\phi_n, v(\alpha)) = e^{-|\alpha|^2/2}\sum_{k=0}^{\infty}\frac{\alpha^k}{\sqrt{k!}}\,(\phi_n, \phi_k) = e^{-|\alpha|^2/2}\,\frac{\alpha^n}{\sqrt{n!}},$

for n = 0, 1, 2, 3, .... Taking the modulus squared, using (1.2.1), we have the formula
for the Poisson family,

$$P_{v(\alpha)}\{\text{result} = n\} = e^{-|\alpha|^2}\,\frac{(|\alpha|^2)^n}{n!}, \qquad n = 0, 1, 2, 3, \ldots.$$

Put the Poisson parameter $\lambda = |\alpha|^2$. Thus we see that $\lambda$ is real and nonnegative.

It may be remarked that this is a complicated method for obtaining the Poisson 
family. The point is that we now have a context in which to infer a probability 
distribution on the parameter space, given an observed Poisson value n. 
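A short numerical check of this construction: the squared modulus of the coherent state amplitude reproduces the Poisson probabilities, and only the modulus of the complex parameter matters. The parameter value below is hypothetical.

```python
import math


def coherent_amplitude(n, alpha):
    """Inner product of the n-th basis vector with the coherent state, as in (2.4.1)."""
    return math.exp(-abs(alpha) ** 2 / 2) * alpha ** n / math.sqrt(math.factorial(n))


def poisson_pmf(n, lam):
    return math.exp(-lam) * lam ** n / math.factorial(n)


alpha = complex(1.2, -0.9)   # hypothetical complex parameter
lam = abs(alpha) ** 2        # Poisson parameter, here 2.25

from_states = [abs(coherent_amplitude(n, alpha)) ** 2 for n in range(40)]
from_poisson = [poisson_pmf(n, lam) for n in range(40)]

# Same modulus, different phase: the distribution should not change
alpha_rotated = complex(0.0, 1.5)
from_rotated = [abs(coherent_amplitude(n, alpha_rotated)) ** 2 for n in range(40)]
```

The lists `from_states` and `from_rotated` both match the Poisson probabilities with parameter 2.25, since the two parameter values share the modulus 1.5.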



2.5. POV measures versus PV measures 



Consider the definition of a projection-valued (PV) measure, or spectral measure
(see, for example, Busch, Grabowski, and Lahti (1995)), which had been introduced 
heuristically in Section 1.2. 
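Definition. Let $\mathcal{B}(\mathbb{R})$ denote the Borel sets of the real line $\mathbb{R}$ and $\Lambda(\mathcal{H})$ denote the
set of bounded linear operators on $\mathcal{H}$. A mapping $\mathcal{E} : \mathcal{B}(\mathbb{R}) \to \Lambda(\mathcal{H})$ is a projection-valued
(PV) measure, or a spectral measure, if

(2.5.1)
$$\mathcal{E}(B) = \mathcal{E}^{\dagger}(B) = \mathcal{E}^{2}(B), \quad \forall B \in \mathcal{B}(\mathbb{R}),$$
$$\mathcal{E}(\mathbb{R}) = I, \qquad \mathcal{E}\Big(\bigcup_i B_i\Big) = \sum_i \mathcal{E}(B_i) \quad \text{for all disjoint sequences } \{B_i\} \subset \mathcal{B}(\mathbb{R}),$$

where the series converges in the weak operator topology. This spectral measure
gives rise to the definition of a unique self-adjoint operator O defined by
$O = \int x\,\mathcal{E}(dx)$, with its domain of definition

(2.5.2)    $\mathcal{D}(O) = \left\{\psi \in \mathcal{H} \text{ s.t. } \left(\psi, \int x^2\,\mathcal{E}(dx)\,\psi\right) = \int x^2\,(\psi, \mathcal{E}(dx)\psi) \text{ converges}\right\}.$

Given a self-adjoint operator O with domain of definition $\mathcal{D}(O) \subset \mathcal{H}$, there is a
unique PV measure $\mathcal{E} : \mathcal{B}(\mathbb{R}) \to \Lambda(\mathcal{H})$ such that $\mathcal{D}(O)$ is (2.5.2), and for any
$\psi \in \mathcal{D}(O)$, $(\psi, O\psi) = \int x\,(\psi, \mathcal{E}(dx)\psi)$. Therefore there is a one-to-one correspondence
between self-adjoint operators O and real PV measures $\mathcal{E}$.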

In the case of the Poisson distribution, the self-adjoint operator is $N$ with normalized
eigenvectors $\{\phi_k\}$ as the orthonormal basis of $\mathcal{H}$, and $\mathrm{sp}(N) = \{0, 1, 2, \ldots\}$. For
an inferential probability measure operator on the parameter space associated with
the Poisson distribution, we will have neither a PV measure nor a self-adjoint operator.
Instead we will have a subnormal operator and a so-called positive operator-valued
(POV) measure, where the first line of (2.5.1) is amended to read

\[ E(B) \ \text{is a positive operator for all } B \in \mathcal{B}(\mathbb{R}). \]

The operators for PV measures are projections. The properties prescribed for those
projections $E(B)$ are just those needed so that the corresponding inner products
$(\psi, E(dx)\psi)$ for vector states $\psi$ will have the properties of probabilities. However,
for the definition of probability there is no requirement that the operators be projections.
In fact, if they are positive operators, they can still lead to probabilities
when put together with states.
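
The distinction can be made concrete in two dimensions. The sketch below is a hypothetical toy example, not taken from the paper: it builds the spectral projections of a small self-adjoint matrix, checks the PV properties, and then forms "unsharp" positive operators that are not projections but still yield probabilities. The matrix `O`, the mixing weight `eps`, and the state `psi` are all assumptions chosen for illustration.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def combine(wa, a, wb, b):
    """Linear combination wa*a + wb*b of two 2x2 matrices."""
    return [[wa * a[i][j] + wb * b[i][j] for j in range(2)] for i in range(2)]


def expectation(psi, op):
    """Inner product (psi, op psi) for a real vector psi."""
    return sum(psi[i] * op[i][j] * psi[j] for i in range(2) for j in range(2))


# Illustrative self-adjoint operator with spectrum {1, 3}.
O = [[F(2), F(1)], [F(1), F(2)]]
P1 = [[F(1, 2), F(-1, 2)], [F(-1, 2), F(1, 2)]]   # E({1}): projection onto (1, -1)/sqrt(2)
P3 = [[F(1, 2), F(1, 2)], [F(1, 2), F(1, 2)]]     # E({3}): projection onto (1, 1)/sqrt(2)
identity = [[1, 0], [0, 1]]

# PV properties: idempotent, summing to I, and O = sum over x of x * E({x}).
pv_ok = (matmul(P1, P1) == P1 and matmul(P3, P3) == P3
         and combine(1, P1, 1, P3) == identity and combine(1, P1, 3, P3) == O)

# Hypothetical unsharp version: positive operators summing to I, but not projections.
eps = F(1, 5)
M1 = combine(1 - eps, P1, eps, P3)
M3 = combine(eps, P1, 1 - eps, P3)
pov_ok = combine(1, M1, 1, M3) == identity and matmul(M1, M1) != M1

psi = [F(3, 5), F(4, 5)]                          # a unit vector state
pv_probs = [expectation(psi, P1), expectation(psi, P3)]
pov_probs = [expectation(psi, M1), expectation(psi, M3)]
print(pv_ok, pov_ok, pv_probs, pov_probs)         # each list of probabilities sums to 1
```

Exact rational arithmetic is used so that the idempotency and normalization checks are equalities rather than floating-point approximations.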



2.6. An invariant measure on the parameter space 

Now we consider the inferential case. In a sense we reverse the roles of states
and observable operators. If the Poisson value $n$ was observed, what was formerly
a vector $\phi_n$ denoting a one-dimensional projection $E(\{n\})$ now becomes a state.
What was formerly a family of coherent states, $v(\alpha)$, now leads to the construction
of a POV measure on the parameter set.

In order to obtain a measure on the parameter space $\mathbb{C}$ which is invariant to an
operator $D(\beta)$, for arbitrary complex number $\beta$, we need to see how the operator
transforms a coherent state $v(\alpha)$. Consider $D(\beta)v(\alpha) = D(\beta)D(\alpha)\phi_0$. Using the
Baker-Campbell-Hausdorff identity of Section 2.3 with $O_1 = \beta A^{\dagger} - \beta^{*} A$ and $O_2 = \alpha A^{\dagger} - \alpha^{*} A$,
we have $D(\beta)D(\alpha)\phi_0 = e^{i\,\mathrm{Im}(\beta\alpha^{*})} D(\beta + \alpha)\phi_0$. As states, $D(\beta)v(\alpha) = v(\beta + \alpha)$.
Thus, the operator $D$ acts as a translation operator on the complex plane so that
the invariant measure, $d\mu(\alpha)$, is just Lebesgue measure, $d\mu(\alpha) = c\, d\alpha_1\, d\alpha_2$, where
$\alpha = \alpha_1 + i\alpha_2$, and where $c = 1/\pi$ by normalization.

2.7. An inferred distribution on the parameter space

By general group theory, the irreducibility of the group representation by unitary
operators in the Hilbert space $\mathcal{H}$ implies that the coherent states are complete in
$\mathcal{H}$. (See, for example, Perelomov (1986).) Thus, for any vectors $\psi_1$ and $\psi_2$ in $\mathcal{H}$,

\[ (\psi_1, \psi_2) = \int_{\mathbb{C}} (\psi_1, v(\alpha))\, (v(\alpha), \psi_2)\, d\mu(\alpha). \]
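
The completeness relation can be checked numerically for basis vectors, where the left-hand side is 1 for equal indices and 0 otherwise. The sketch below is a rough check under stated assumptions: it again assumes the standard amplitude $e^{-|\alpha|^2/2}\alpha^n/\sqrt{n!}$, uses the measure $d\mu = \frac{1}{\pi} r\, dr\, d\theta$ in polar coordinates, and truncates the radial integral at an arbitrary cutoff `r_max`.

```python
import cmath
import math


def amplitude(alpha: complex, n: int) -> complex:
    """Assumed form of (phi_n, v(alpha)): exp(-|alpha|^2 / 2) * alpha^n / sqrt(n!)."""
    return cmath.exp(-abs(alpha) ** 2 / 2) * alpha ** n / math.sqrt(math.factorial(n))


def overlap(m: int, n: int, r_max: float = 8.0, n_r: int = 800, n_theta: int = 32) -> complex:
    """Approximate the integral of (phi_m, v)(v, phi_n) over C with d mu = r dr dtheta / pi."""
    h = r_max / n_r                          # radial step; n_r must be even for Simpson's rule
    total = 0j
    for i in range(n_r + 1):
        r = i * h
        weight = 1 if i in (0, n_r) else (4 if i % 2 else 2)
        ring = 0j
        for k in range(n_theta):             # equally spaced points on the circle of radius r
            a = cmath.rect(r, 2 * math.pi * k / n_theta)
            ring += amplitude(a, m) * amplitude(a, n).conjugate()
        ring *= 2 * math.pi / n_theta        # angular integral at this radius
        total += weight * ring * r
    return total * h / 3 / math.pi


print(abs(overlap(2, 2)), abs(overlap(1, 3)))   # approximately 1 and approximately 0
```

The angular sum kills the off-diagonal terms, and the radial integral supplies the normalization, which is where the constant $c = 1/\pi$ enters.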

The coherent states form a so-called "overcomplete" basis for a Hilbert space in the
sense that they are complete and can be normalized but are not orthogonal. The
Hilbert space $\mathcal{H}_{CS}$ which they span may be visualized as a proper subspace of $L^2(\mathbb{C})$,
the space of square integrable functions of a complex variable with inner product
$(f(\alpha), g(\alpha))_{CS} = \int_{\mathbb{C}} f(\alpha)^{*} g(\alpha)\, d\mu(\alpha)$. In Ali, Antoine, and Gazeau (2000), an isometric
map $\rho$ is given which associates an element $\phi$ in $\mathcal{H}_N$ with an element (function of $\alpha$)
in $\mathcal{H}_{CS}$: $\rho(\phi) = (\phi, v(\alpha))_{\mathcal{H}_N}$. In $\mathcal{H}_{CS}$ we construct a POV measure $M$ which leads
to a probability distribution for $\alpha$ defined by

\[ P\{\alpha \in A \text{ for state } \psi\} = (\psi, M(A)\psi) = \int_A |(\psi, v(\alpha))|^2\, d\mu(\alpha) \]

for complex Borel set $A$ and for state $\psi$. In particular, consider the (eigenvector)
state $\psi = \phi_n$ corresponding to an observed Poisson value $n$, an eigenvalue of the
self-adjoint number operator $N$: $P\{\alpha \in A \text{ for state } \phi_n\} = \int_A |(\phi_n, v(\alpha))|^2\, d\mu(\alpha)$. This
provides us with a probability distribution on the whole parameter space, namely,
the complex plane. But the Poisson parameter, a real number, is the modulus
squared $|\alpha|^2$ of $\alpha$. Expressing $\alpha$ in polar coordinates, $\alpha = re^{i\theta}$, with $r > 0$ and
$0 \le \theta < 2\pi$, we obtain the invariant measure $d\mu(\alpha) = \frac{1}{2\pi}\, dr^2\, d\theta = \frac{1}{\pi}\, r\, dr\, d\theta$. Then
integrating $\theta$ from $0$ to $2\pi$, we obtain the marginal distribution for $r^2$ as follows.
For real Borel set $B$,

\[ P\{r^2 \in B \text{ for state } \phi_n\} = \int_B e^{-r^2} \frac{r^{2n}}{n!} \left( \frac{1}{2\pi} \int_0^{2\pi} d\theta \right) dr^2 = \int_B e^{-\lambda} \frac{\lambda^n}{n!}\, d\lambda, \]

where $\lambda = r^2$ and the expression for $|(\phi_n, v(\alpha))|^2$ is obtained similarly as in (2.4.1).
We see that this corresponds to a Bayes posterior distribution with uniform prior
distribution for the parameter $\lambda$.
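
The inferred density $e^{-\lambda}\lambda^n/n!$ is the Gamma density with shape $n + 1$ and unit scale, so it should integrate to one and have mean $n + 1$. The following sketch checks this numerically; the observed count `n`, the truncation point `upper`, and the helper `simpson` are illustrative assumptions rather than anything specified in the paper.

```python
import math


def inferred_density(lam: float, n: int) -> float:
    """Inferred density for lambda given observed count n: exp(-lam) * lam^n / n!."""
    return math.exp(-lam) * lam ** n / math.factorial(n)


def simpson(f, a: float, b: float, steps: int = 20000) -> float:
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    s = f(a) + f(b)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3


n = 3            # hypothetical observed Poisson count
upper = 80.0     # truncation point; the tail beyond it is negligible for small n

mass = simpson(lambda x: inferred_density(x, n), 0.0, upper)
mean = simpson(lambda x: x * inferred_density(x, n), 0.0, upper)
print(round(mass, 6), round(mean, 6))   # total mass and mean of the inferred distribution
```

Because the likelihood in $\lambda$ already integrates to one, multiplying by a flat prior and renormalizing leaves it unchanged, which is the coincidence with the uniform-prior posterior noted above.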

3. The binomial family 

We construct the binomial family in the same way as was done for the Poisson family.
In this case the coherent states are built from irreducible representations of the
Lie algebra of the rotation group $SO(3)$ of real 3 x 3 orthogonal matrices with
determinant one, instead of the Weyl-Heisenberg group. The Weyl-Heisenberg Lie
algebra is three-dimensional nilpotent whereas the Lie algebra corresponding to
$SO(3)$ is three-dimensional simple.

3.1. The rotation group and Lie algebra 

Although there are nine matrix elements in a 3 x 3 real matrix, the constraints of
orthogonality and unit determinant for an element $g$ of $SO(3)$ imply that $g$ can be
identified by three real parameters. There are two general methods for indicating
the parameters. One way is by using the three Euler angles. The other way is to
specify an axis and an angle of rotation about that axis.

The rotation group is locally isomorphic to the group $SU(2)$ of 2 x 2 complex
unitary matrices of determinant one. An element $u$ of $SU(2)$ is identified by two
complex numbers $\alpha$ and $\beta$ where $|\alpha|^2 + |\beta|^2 = 1$. The relationship between $SO(3)$
and $SU(2)$ is that of a unit sphere to its stereographic projection upon the complex
plane as shown in Naimark (1964). Although the relationship is actually homomorphic
(one $g$ to two $u$), they have the same Lie algebra and so can be used
interchangeably in the context presented here. It is more intuitive to work with
$SO(3)$ but from the point of view of the binomial distribution, it will turn out to
be more pertinent to work with $SU(2)$. Both $SO(3)$ and $SU(2)$ are compact as
topological groups (Vilenkin (1968)).

In this case, since we start with matrices, basis elements of the Lie algebra can be
easily obtained by differentiating the three matrices corresponding to the subgroups
of rotations about the three spatial coordinate axes $(x, y, z)$. Thus, for example, the
subgroup of $SO(3)$ indicating rotations about the $z$ axis is given by

\[ a_3(t) = \begin{pmatrix} \cos t & -\sin t & 0 \\ \sin t & \cos t & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

Differentiating each matrix element with respect to $t$ and then setting $t = 0$, we
obtain the algebra basis element

\[ e_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \]

Similarly, we obtain the three basis elements $e_1, e_2, e_3$ with commutation relations

\[ [e_1, e_2] = e_3, \quad [e_2, e_3] = e_1, \quad [e_3, e_1] = e_2. \tag{3.1.1} \]
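
The commutation relations can be verified directly with small matrices. The sketch below assumes the usual counterclockwise sign convention for rotations about the coordinate axes (the paper's own matrices did not survive extraction, so this convention is an assumption), differentiates the $z$-axis rotation numerically at $t = 0$, and checks the three brackets.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]


def a3(t):
    """Assumed form of the rotation by angle t about the z axis."""
    return [[math.cos(t), -math.sin(t), 0.0], [math.sin(t), math.cos(t), 0.0], [0.0, 0.0, 1.0]]


def derivative_at_zero(curve, h=1e-6):
    """Central-difference derivative of a matrix-valued curve at t = 0."""
    plus, minus = curve(h), curve(-h)
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(3)] for i in range(3)]


# Generators of rotations about the x, y and z axes under the assumed convention.
e1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
e2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
e3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

relations_hold = (commutator(e1, e2) == e3
                  and commutator(e2, e3) == e1
                  and commutator(e3, e1) == e2)
numeric_e3 = derivative_at_zero(a3)
print(relations_hold, numeric_e3)
```

With the opposite sign convention for the rotations, every generator changes sign and the brackets acquire a minus sign, so the convention matters when matching (3.1.1).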



3.2. A homogeneous space for the group 

The rotation group $G = SO(3)$ acts as a transformation group in three-dimensional
Euclidean space. However, $SO(3)$ is not transitive on the whole space. It is transitive
on spheres. We take the unit two-sphere $S^2$ as a homogeneous space for the group.
But there is not a one-to-one relationship between group elements and points on the
unit sphere. A one-to-one relationship (excluding the South Pole of the sphere) is
provided by the cosets $G/H_{NP}$ of the group with respect to the isotropy subgroup
$H_{NP}$ of the North Pole $(0, 0, 1)$ of $S^2$. In $SO(3)$, the subgroup is the group $a_3(t)$
of rotations about the $z$ axis. In $SU(2)$, the subgroup $U(1)$ is the set of diagonal
matrices $h(t)$ with diagonal elements $e^{it}$ and $e^{-it}$.

Following Perelomov (1986), we consider cosets $SO(3)/SO(2)$ or cosets $SU(2)/U(1)$.
The one-to-one relationship of cosets $SU(2)/U(1)$ with the unit sphere $S^2$
(excluding the South Pole) is given in the following manner. Given a point $v$ on the
unit sphere indicated by

\[ v = (\sin\theta\cos\gamma, \ \sin\theta\sin\gamma, \ \cos\theta), \tag{3.2.1} \]

associate the coset $g_v$ where $g_v = \exp\big(i\theta(\sin\gamma\, M_1 - \cos\gamma\, M_2)\big)$, where the matrices
$M_1$ and $M_2$ are given by $M_1 = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $M_2 = \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$. In terms of rotations,
the matrix describes a rotation by angle $\theta$ about the axis indicated by direction
$(\sin\gamma, -\cos\gamma, 0)$ which is perpendicular to both the North Pole and the vector $v$.
We can express a general element $u$ of $SU(2)$ by

\[ u = g_v h, \quad \text{where } g_v \in \text{coset } SU(2)/U(1), \ h \in U(1). \tag{3.2.2} \]

3.3. Irreducible representations 

Now we consider a representation of the algebra with basis given by linear operators
$E_k$, for $k = 1, 2, 3$, which operate in a Hilbert space $\mathcal{H}$. Since the group is compact,
general theory provides the result that irreducible representations correspond to
finite dimensional Hilbert spaces. In the algebra representation space, the basis
elements have the same commutation relations as (3.1.1). Also, we require that the
operators $J_k = iE_k$ be self-adjoint with respect to the inner product of $\mathcal{H}$.

As in Section 2.1, introduce creation and annihilation operators $J_+ = J_1 + iJ_2$,
$J_- = J_1 - iJ_2$. It is known that the complete set of irreducible representations
of the Lie algebra is indexed by a non-negative integer or half-integer $j$
while the dimension of the representation space is $2j + 1$. Correspondingly, the complete
set of unitary irreducible representations $T^j(u)$ of the group $SU(2)$ is given by
$j = 0, 1/2, 1, 3/2, \ldots$. Since the relationship of $SU(2)$ to the rotation group $SO(3)$
is two-to-one, the irreducible representations of the rotation group are more properly
indexed by the non-negative integers, omitting the half-integers. We will see
that the parameter $n$ for the binomial distribution is equal to $2j$, implying that
we need the non-negative half-integers in the list. Thus we focus upon the group
$SU(2)$. Choose and fix the number $j$ indicating a definite Hilbert space $\mathcal{H}_j$. An
orthonormal basis for $\mathcal{H}_j$ is provided by the eigenvectors $\phi_m$ of the self-adjoint
operator $J_3$ which, for fixed $j$, has a simple discrete and finite spectrum indexed
by $m = -j, -j + 1, \ldots, j - 1, j$. As operators in $\mathcal{H}_j$, $J_+$, $J_-$ and $J_3$ have creation,
annihilation, and number properties similar to those in (2.2.1):

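\[
\begin{aligned}
J_+\phi_j &= 0, & J_+\phi_m &= \sqrt{(j - m)(j + m + 1)}\ \phi_{m+1}, & &\text{for } m = -j, -j + 1, \ldots, j - 1, \\
J_-\phi_{-j} &= 0, & J_-\phi_m &= \sqrt{(j + m)(j - m + 1)}\ \phi_{m-1}, & &\text{for } m = -j + 1, -j + 2, \ldots, j, \\
J_3\phi_m &= m\phi_m, & & & &\text{for } m = -j, -j + 1, \ldots, j - 1, j.
\end{aligned}
\tag{3.3.1}
\]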
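
The relations (3.3.1) determine explicit matrices for $J_+$, $J_-$ and $J_3$ in the basis $\phi_{-j}, \ldots, \phi_j$. The following sketch builds them for a chosen $j$ and checks the standard consequences $[J_3, J_\pm] = \pm J_\pm$ and $[J_+, J_-] = 2J_3$. The function names and the basis ordering are illustrative assumptions.

```python
import math
from fractions import Fraction


def spin_matrices(j):
    """Return (J_plus, J_minus, J_3) in the basis phi_{-j}, ..., phi_{j}, following (3.3.1)."""
    j = Fraction(j)
    dim = int(2 * j + 1)
    ms = [-j + k for k in range(dim)]                 # m = -j, -j+1, ..., j
    jp = [[0.0] * dim for _ in range(dim)]
    jm = [[0.0] * dim for _ in range(dim)]
    j3 = [[0.0] * dim for _ in range(dim)]
    for col, m in enumerate(ms):
        j3[col][col] = float(m)
        if col + 1 < dim:                             # J_+ phi_m = sqrt((j-m)(j+m+1)) phi_{m+1}
            jp[col + 1][col] = math.sqrt((j - m) * (j + m + 1))
        if col > 0:                                   # J_- phi_m = sqrt((j+m)(j-m+1)) phi_{m-1}
            jm[col - 1][col] = math.sqrt((j + m) * (j - m + 1))
    return jp, jm, j3


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][c] for k in range(n)) for c in range(n)] for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][c] - ba[i][c] for c in range(n)] for i in range(n)]


def max_abs_diff(a, b, scale=1.0):
    """Largest entrywise difference between a and scale * b."""
    n = len(a)
    return max(abs(a[i][c] - scale * b[i][c]) for i in range(n) for c in range(n))


jp, jm, j3 = spin_matrices(Fraction(3, 2))            # illustrative choice j = 3/2
print(max_abs_diff(commutator(j3, jp), jp),
      max_abs_diff(commutator(j3, jm), jm, scale=-1.0),
      max_abs_diff(commutator(jp, jm), j3, scale=2.0))
```

The dimension $2j + 1$ of these matrices is what later becomes $n + 1$, the number of possible values of a binomial count.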

Note that in (2.2.1) there is a minimum basis vector $\phi_0$, but no maximum, indicating
an infinite dimensional Hilbert space. Here we have both a minimum and a
maximum basis vector. We relate $\phi_m$ and $\phi_{-j}$ by

\[ \phi_m = \sqrt{\frac{(j - m)!}{(j + m)!\,(2j)!}}\ (J_+)^{j+m} \phi_{-j}, \tag{3.3.2} \]

which follows by applying the raising relation in (3.3.1) $j + m$ times.

For fixed number $j$, and for each $u \in SU(2)$, let $u \to T^j(u)$ denote an irreducible
representation of $SU(2)$ in the Hilbert space $\mathcal{H}_j$, where each operator is unitary
with respect to the inner product in $\mathcal{H}_j$. From (3.2.2), we have $T^j(u) = D(v)T^j(h)$
for $h \in U(1)$, where $D(v) = T^j(g_v)$,

\[ T^j(g_v) = \exp\big(i\theta(\sin\gamma\, J_1 - \cos\gamma\, J_2)\big), \quad \text{for } 0 \le \theta < \pi. \]

It can be shown that for $h \in U(1)$, $T^j(h)$ is a scalar factor of modulus one. Thus
we focus upon $D(v)$.

3.4. The family of coherent states and the binomial family

We choose the element $\phi_{-j}$ as the fixed vector in $\mathcal{H}_j$. Then, as in Section 2,
the family of coherent states is given by

\[ w(v) = D(v)\phi_{-j} = \exp\big(i\theta(\sin\gamma\, J_1 - \cos\gamma\, J_2)\big)\phi_{-j}, \quad \text{for } 0 \le \theta < \pi. \]

As in Perelomov (1986), we find it convenient to re-parameterize, as in Section 2.1,
and use one complex parameter $\xi$ along with the creation and annihilation
operators instead of the two real angle parameters with the $J_1$ and $J_2$ operators.
Thus for $\xi = \frac{i\theta}{2}(\sin\gamma + i\cos\gamma)$, we have $D(\xi) = \exp(\xi J_+ - \xi^{*} J_-)$. We seek an
explicit expression for the coherent states $w(v)$. As in Section 2.3, the method is to
factor the exponential function. The Baker-Campbell-Hausdorff formula is not convenient
to use in this case as it was in Section 2.3. Instead, the Gauss decomposition
of the group $SL(2, \mathbb{C})$ is used. We obtain $D(\xi) = \exp(\zeta J_+) \exp(\eta J_3) \exp(\zeta' J_-)$,
where $\zeta = -\tan(\theta/2)\, e^{-i\gamma}$, $\eta = -2\ln\cos|\xi| = \ln(1 + |\zeta|^2)$, $\zeta' = -\zeta^{*}$. Finally, using
(3.3.1) and (3.3.2), we obtain coherent states

\[ w(\zeta) = \sum_{m=-j}^{j} \sqrt{\frac{(2j)!}{(j + m)!\,(j - m)!}}\ \frac{\zeta^{j+m}}{(1 + |\zeta|^2)^{j}}\ \phi_m. \]

In terms of angle parameters,

\[ w(\theta, \gamma) = \sum_{m=-j}^{j} \sqrt{\frac{(2j)!}{(j + m)!\,(j - m)!}}\ \Big({-\sin\frac{\theta}{2}}\Big)^{j+m} \Big(\cos\frac{\theta}{2}\Big)^{j-m} e^{-i(j+m)\gamma}\ \phi_m. \tag{3.4.1} \]

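Thus, noting that the possible result values are the eigenvalues of $J_3$, namely,
$m = -j, -j + 1, \ldots, j - 1, j$, we have

\[ P\{\text{result} = \ell, \text{ when the state is } w(\theta, \gamma)\} = |(\phi_\ell, w(\theta, \gamma))|^2. \]

Using the fact that the eigenvectors $\phi_m$ of $J_3$ are orthonormal, we have

\[ (\phi_\ell, w(\theta, \gamma)) = \sqrt{\frac{(2j)!}{(j + \ell)!\,(j - \ell)!}}\ \Big({-\sin\frac{\theta}{2}}\Big)^{j+\ell} \Big(\cos\frac{\theta}{2}\Big)^{j-\ell} e^{-i(j+\ell)\gamma}. \tag{3.4.2} \]

Upon taking the modulus squared, we have

\[ P_{w(\theta, \gamma)}\{\text{result} = \ell\} = \frac{(2j)!}{(j + \ell)!\,(j - \ell)!} \Big(\sin^2\frac{\theta}{2}\Big)^{j+\ell} \Big(\cos^2\frac{\theta}{2}\Big)^{j-\ell} \]

for $\ell = -j, -j + 1, \ldots, j - 1, j$. For the binomial $(n, p)$ distribution, put $n = 2j$,
renumber the possible values by $k = j + \ell$ and put $p = \sin^2(\theta/2)$.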
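
The identification with the binomial family can be checked numerically. The sketch below evaluates the modulus-squared formula for an illustrative spin $j$ and polar angle $\theta$, then compares it with the binomial probabilities for $n = 2j$ and $p = \sin^2(\theta/2)$. The function names and the particular values of `j` and `theta` are assumptions made for the example.

```python
import math
from fractions import Fraction


def state_probability(j: Fraction, ell: Fraction, theta: float) -> float:
    """|(phi_ell, w(theta, gamma))|^2 for spin j; the value does not depend on gamma."""
    up, down = int(j + ell), int(j - ell)            # exponents j + ell and j - ell
    coeff = math.factorial(int(2 * j)) / (math.factorial(up) * math.factorial(down))
    return coeff * math.sin(theta / 2) ** (2 * up) * math.cos(theta / 2) ** (2 * down)


def binomial_pmf(n: int, k: int, p: float) -> float:
    """Ordinary binomial probability C(n, k) p^k (1 - p)^(n - k)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


j = Fraction(5, 2)                    # illustrative half-integer spin, so n = 2j = 5
theta = 1.1                           # illustrative polar angle
n, p = int(2 * j), math.sin(theta / 2) ** 2

# The two columns printed for each k should agree.
for k in range(n + 1):
    ell = k - j                       # ell = k - j runs over -j, ..., j
    print(k, round(state_probability(j, ell, theta), 6), round(binomial_pmf(n, k, p), 6))
```

A half-integer $j$ gives an odd $n$, which is why the half-integer representations of $SU(2)$ cannot be omitted.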



3.5. An inferred distribution on the parameter space 

The parameters $\theta$ and $\gamma$ index the parameter space, that is, the points of the
unit two-sphere, which is isomorphic to the cosets $SU(2)/U(1)$, or equivalently
$SO(3)/SO(2)$, and where points are given by the (three-dimensional) vector $v$ as in
(3.2.1). In other words, we can take the unit sphere to be the parameter space. The
coherent states, also indexed by the point $v$, are complete in the Hilbert space $\mathcal{H}_j$.
As before, we have an isometric map from $\mathcal{H}_j$ to the Hilbert space $\mathcal{H}^j_{CS}$ spanned
by the coherent states. Since $D(v)$ takes one coherent state into another coherent
state, we have the action of $D(v)$ on $\mathcal{H}^j_{CS}$. The (normalized) measure invariant to
the action of $D(v)$ is Lebesgue measure on the sphere: $d(\theta, \gamma) = \frac{2j + 1}{4\pi} \sin\theta\, d\theta\, d\gamma$.

Suppose that we have an observed binomial count value $k$ which, with $k = j + \ell$,
gives $\ell = k - j$, for possible values $\ell = -j, -j + 1, \ldots, j - 1, j$ corresponding to
possible values $k = 0, 1, 2, \ldots, 2j$. Then the corresponding inferred distribution on
the parameter space, derived from a POV measure, is

\[ P\{(\theta, \gamma) \in A \text{ when the state is } \phi_\ell\} = \int_A |(\phi_\ell, w(v))|^2\, d(\theta, \gamma), \]

where the inner product inside the integral sign is that of $\mathcal{H}_j$. From the expression
for $w(v)$ in (3.4.1) and the fact that the vectors $\phi_m$ are orthonormal in $\mathcal{H}_j$, giving
the inner product (3.4.2), we have the joint distribution of $\theta$ and $\gamma$:

\[
\begin{aligned}
P\{(\theta, \gamma) \in A \text{ when the state is } \phi_\ell\}
&= \frac{2j + 1}{4\pi} \iint_A |(\phi_\ell, w(\theta, \gamma))|^2 \sin\theta\, d\theta\, d\gamma \\
&= \frac{2j + 1}{4\pi} \iint_A \frac{(2j)!}{(j + \ell)!\,(j - \ell)!} \Big(\sin^2\frac{\theta}{2}\Big)^{j+\ell} \Big(\cos^2\frac{\theta}{2}\Big)^{j-\ell} \sin\theta\, d\theta\, d\gamma.
\end{aligned}
\]



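For the marginal distribution of $\theta$, we integrate $\gamma$ over $0 \le \gamma < 2\pi$. For
$B$ a Borel set in $[0, \pi)$,

\[
\begin{aligned}
P\{\theta \in B \text{ when the state is } \phi_\ell\}
&= \frac{2j + 1}{2} \int_B \frac{(2j)!}{(j + \ell)!\,(j - \ell)!} \Big(\sin^2\frac{\theta}{2}\Big)^{j+\ell} \Big(\cos^2\frac{\theta}{2}\Big)^{j-\ell} \sin\theta\, d\theta \\
&= \frac{n + 1}{2} \int_B \frac{n!}{k!\,(n - k)!} \Big(\sin^2\frac{\theta}{2}\Big)^{k} \Big(\cos^2\frac{\theta}{2}\Big)^{n-k} \sin\theta\, d\theta \\
&= (n + 1) \int_{\tilde{B}} \frac{n!}{k!\,(n - k)!}\, p^{k} (1 - p)^{n-k}\, dp,
\end{aligned}
\]

for $p = \sin^2(\theta/2)$, where $\tilde{B}$ denotes the image of $B$ under this change of variable, implying a uniform prior distribution for the canonical parameter $p$.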
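
The change of variable can be checked numerically. Since $dp = \frac{1}{2}\sin\theta\, d\theta$, the density in $p$ is $(n + 1)\binom{n}{k} p^k (1 - p)^{n-k}$, which is the Beta density with parameters $k + 1$ and $n - k + 1$ and mean $(k + 1)/(n + 2)$. The sketch below compares the two parameterizations; the values of `n`, `k`, the interval endpoints, and the helper `simpson` are illustrative assumptions.

```python
import math


def theta_density(theta: float, n: int, k: int) -> float:
    """Marginal inferred density of theta on [0, pi) for an observed count k out of n."""
    s2, c2 = math.sin(theta / 2) ** 2, math.cos(theta / 2) ** 2
    return (n + 1) / 2 * math.comb(n, k) * s2 ** k * c2 ** (n - k) * math.sin(theta)


def p_density(p: float, n: int, k: int) -> float:
    """The same distribution expressed in p = sin^2(theta / 2)."""
    return (n + 1) * math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def simpson(f, a: float, b: float, steps: int = 2000) -> float:
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    s = f(a) + f(b)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3


n, k = 10, 3                                   # hypothetical observation: 3 successes in 10
mass_theta = simpson(lambda t: theta_density(t, n, k), 0.0, math.pi)
mass_p = simpson(lambda x: p_density(x, n, k), 0.0, 1.0)
mean_p = simpson(lambda x: x * p_density(x, n, k), 0.0, 1.0)

# Probability of an interval in theta versus its image in p.
a, b = 0.4, 1.3
prob_theta = simpson(lambda t: theta_density(t, n, k), a, b)
prob_p = simpson(lambda x: p_density(x, n, k), math.sin(a / 2) ** 2, math.sin(b / 2) ** 2)
print(mass_theta, mass_p, mean_p, prob_theta, prob_p)
```

The factor $n + 1 = 2j + 1$ in the invariant measure is exactly the normalizing constant that a flat prior on $p$ would produce.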



4. Discussion 



We have constructed a group theoretic context for the two discrete probability
distributions, Poisson and binomial. As in other group invariance methods,
the idea is to construct probability families by group action. However, in contrast
to others, we have neither a pivotal function nor group action on the value space
of the random variable. Thus our method is applicable to the discrete case. In this
paper the Poisson and binomial families were constructed by using the properties 
of certain families of vectors (coherent states) which due to their completeness 
property enable the construction of measures leading to inferred distributions on 
the parameter spaces. The formulas for the inferred distributions obtained in those 
two examples coincided with Bayesian posterior distributions in the case where the 
prior distributions were uniform. We emphasize the fact that although the formulas 
for the two methods coincide in the end result, the two methods are distinctly 
different. 

This difference may be illustrated by considering Thomas Bayes' justification
for a uniform prior in the binomial case elucidated in Stigler (1982). Here the
emphasis is not on the parameter itself, but rather on the marginal distribution
of the binomial random variable X obtained from the joint distribution of X and
parameter p. Starting with a joint distribution applied to a particular binomial
physical situation (billiard table example) in which the parameter has uniform
distribution and integrating out the parameter obtaining the marginal distribution
for X, one obtains the result that X has a discrete uniform distribution. Then the
reasoning is that, in the face of no prior knowledge, one assumes a discrete uniform
distribution for X for all n, implying a uniform prior distribution for p. Stigler
notes that if P(X = k) is constant, then so is P(f(X) = f(k)) for any strictly
monotone function f(x), thus answering the objection raised against the principle
of insufficient reason where a uniform distribution for a given parameter would not
be uniform for every monotone function of it. The argument raised against this
approach of Bayes is that it is too restrictive in that it "is very strongly tied to the
binomial model."
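
The first step of the argument above, that a uniform prior on p makes the marginal distribution of X discrete uniform, can be confirmed exactly. The sketch below uses the standard Beta integral, $\int_0^1 p^k (1 - p)^{n-k}\, dp = k!\,(n - k)!/(n + 1)!$, with exact rational arithmetic; the function name is an illustrative choice.

```python
import math
from fractions import Fraction


def marginal_probability(n: int, k: int) -> Fraction:
    """P(X = k) when p is uniform on [0, 1]: C(n, k) times the Beta integral, computed exactly."""
    beta_integral = Fraction(math.factorial(k) * math.factorial(n - k), math.factorial(n + 1))
    return math.comb(n, k) * beta_integral


# For each n, every value k = 0, ..., n receives the same probability 1 / (n + 1).
for n in (1, 4, 9):
    print(n, [str(marginal_probability(n, k)) for k in range(n + 1)])
```

This checks only the direction from a uniform prior to a uniform marginal; Bayes' argument as described by Stigler runs in the opposite direction, taking the uniform marginal for all n as the starting assumption.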

The group theoretic method operates in a different context. There is no joint 
distribution of random variable and parameter and consequently no marginal dis- 
tribution for the random variable. One starts by constructing an ordinary family 
of probability distributions indexed by parameters obtained from a chosen para- 
metric group. To obtain the inferred distribution on the parameter space, the roles 
are exchanged in that the observed value of the original random variable acts as a 
parameter and the former parameters are treated as random. The original random 
variable and parameters are never random at the same time. The reversal in roles 
is possible technically because of the completeness property of the coherent states 
which were used in the first place to construct the family. In the binomial case, the 
relevant group is the matrix group SU(2) and the consequent invariant distribution 
is Lebesgue measure on the 2-sphere. Upon integrating out the azimuthal angle
$\gamma$, we obtain $\sin\theta\, d\theta$ for the polar angle which, with a slight change of variable,
yields a probability distribution for parameter p which is the same formula as a
Bayesian posterior based upon uniform prior for p. We obtained similar results for
the Poisson distribution using the Weyl-Heisenberg group. Clearly, the list of dis- 
crete distributions with associated groups can be extended. Results (unpublished) 
relating to matrix group SU(1, 1) have been obtained for the negative binomial 
distribution. Unlike the case of binomial, our results do not imply uniform prior for 
the commonly used parameter p as given, for example, in Q. 

Efron (1998) has indicated a relationship between the fiducial method of inference 
and the Bayesian method as follows: "By 'objective Bayes' I mean a Bayesian theory 
in which the subjective element is removed from the choice of prior distribution; in 
practical terms a universal recipe for applying Bayes theorem in the absence of prior 
information. A widely accepted objective Bayes theory, which fiducial inference was 
intended to be, would be of immense theoretical and practical importance." 

From the Bayesian point of view, one may interpret this paper as an objective 
method for obtaining a reference prior in the absence of prior information. From 
another point of view, one might interpret this paper as a way of obtaining inferred 
distributions on parameter spaces without the use of the Bayes method. 

Acknowledgments. We thank the editor, the associate editor and the referee for 
their valuable comments and suggestions. 

References 

[1] Ali, S., Antoine, J. and Gazeau, J. (2000). Coherent States, Wavelets and
Their Generalizations. Springer-Verlag, Berlin. MR1735075



Group invariant inferred distributions via noncommutative probability 






[2] Barndorff-Nielsen, O., Gill, R. D. and Jupp, P. D. (2003). On quantum
statistical inference. J. R. Statist. Soc. B 65 775-816. MR2017871

[3] Beltrametti, E. G. and Cassinelli, G. (1981). The Logic of Quantum Mechanics.
Encyclopedia of Mathematics and Its Applications, Vol. 15. Addison-Wesley
Publishing Co., Reading, MA. MR0635780

[4] Busch, P., Grabowski, M. and Lahti, P. (1995). Operational Quantum
Physics. Springer-Verlag, Berlin. MR1356220

[5] Casella, G. and Berger, R. L. (2002). Statistical Inference, 2nd ed.
Duxbury, Pacific Grove, CA. MR1051420

[6] Eaton, M. L. (1989). Group Invariance Applications in Statistics. Inst. Math.
Stat., Hayward, CA. MR1089423

[7] Efron, B. (1998). R. A. Fisher in the 21st century. Statistical
Science 13 95-122. MR1767915

[8] Fraser, D. A. S. (1961). On fiducial inference. Ann. Math. Statist. 32 661-676.
MR0130755

[9] Helland, I. (1999). Quantum mechanics from symmetry and statistical modeling.
International Journal of Theoretical Physics 38 1851-1888. MR2039721

[10] Helland, I. (2003a). Quantum theory as a statistical theory under symmetry
and complementarity. Submitted.

[11] Helland, I. (2003b). Extended statistical modeling under symmetry: The
link towards quantum mechanics. Submitted.

[12] Helstrom, C. (1976). Quantum Detection and Estimation Theory. Academic
Press, New York.

[13] Holevo, A. S. (2001). Statistical Structure of Quantum Theory. Lecture Notes
in Physics, Vol. 67. Springer-Verlag, Berlin. MR1889193

[14] Kass, R. and Wasserman, L. (1996). The selection of prior distributions by
formal rules. Journal of the American Statistical Association 91(435) 1343-1370.

[15] Malley, J. and Hornstein, J. (1993). Quantum statistical inference.
Statistical Science 8(4) 433-457. MR1250150

[16] Miller, Jr., W. (1972). Symmetry Groups and Their Applications. Academic
Press, New York. MR0338286

[17] Naimark, M. A. (1964). Linear Representations of the Lorentz Group.
Pergamon Press, Berlin. MR0170977

[18] Parthasarathy, K. R. (1992). An Introduction to Quantum Stochastic Calculus.
Birkhäuser-Verlag, Berlin. MR1164866

[19] Perelomov, A. (1986). Generalized Coherent States and Their Applications.
Springer-Verlag, Berlin.

[20] Stigler, S. M. (1982). Thomas Bayes's Bayesian inference. J. Roy. Statist.
Soc. A 145 250-258. MR0669120

[21] Vilenkin, N. J. (1968). Special Functions and the Theory of Group Representations.
American Mathematical Society, Providence, RI. MR0229863

[22] Whittle, P. (1992). Probability via Expectation. Springer-Verlag, Berlin.



